Search Results
Improving the performance of small models with knowledge distillation
MedAI #88: Distilling Step-by-Step! Outperforming LLMs with Smaller Model Sizes | Cheng-Yu Hsieh
Knowledge Distillation with TAs
Knowledge Distillation: A Good Teacher is Patient and Consistent
Model Distillation: Same LLM Power but 3240x Smaller
Quantization vs Pruning vs Distillation: Optimizing NNs for Inference
Qi Wu – Compress language models to effective & resource-saving models with knowledge distillation
Knowledge Distillation: The story of small language model learning from large teacher models
Better not Bigger: Distilling LLMs into Specialized Models
Gurtam DevConf 2024: How knowledge distillation inspires foundation models? | Veronika Suprunovich
EfficientML.ai Lecture 9 - Knowledge Distillation (MIT 6.5940, Fall 2023)
[2024 Best AI Paper] Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
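All of the results above center on knowledge distillation, so a minimal sketch of the standard distillation loss may help orient a reader before picking a talk. This is an illustrative assumption of the common setup (Hinton-style soft targets blended with hard-label cross-entropy), not the method of any specific result listed; the function name and hyperparameters (T, alpha) are made up for the example.

# Minimal sketch of a standard knowledge-distillation loss.
# Illustrative only; names and hyperparameters are assumptions, not taken
# from any of the talks or papers listed above.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-softened teacher and
    # student distributions, scaled by T^2 to keep gradient magnitudes
    # comparable to the hard-label term.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy on ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# Toy usage: batch of 4 examples, 10 classes.
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()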